

Search for: All records

Creators/Authors contains: "Zhao, Xuandong"


  1. Free, publicly-accessible full text available September 18, 2026
  2. Text watermarks for large language models (LLMs) have been commonly used to identify the origins of machine-generated content, which is promising for assessing liability when combating deepfake or harmful content. While existing watermarking techniques typically prioritize robustness against removal attacks, unfortunately, they are vulnerable to spoofing attacks: malicious actors can subtly alter the meanings of LLM-generated responses or even forge harmful content, potentially misattributing blame to the LLM developer. To overcome this, we introduce a bi-level signature scheme, Bileve, which embeds fine-grained signature bits for integrity checks (mitigating spoofing attacks) as well as a coarse-grained signal to trace text sources when the signature is invalid (enhancing detectability) via a novel rank-based sampling strategy. Compared to conventional watermark detectors that only output binary results, Bileve can differentiate 5 scenarios during detection, reliably tracing text provenance and regulating LLMs. The experiments conducted on OPT-1.3B and LLaMA-7B demonstrate the effectiveness of Bileve in defeating spoofing attacks with enhanced detectability. 
  3. We present an approach for estimating the fraction of text in a large corpus that is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023 and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e., beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews that report lower confidence, that were submitted close to the deadline, and that come from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text that may be too subtle to detect at the individual level, and discuss the implications of such trends for peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
  4. Raloxifene (RAL) reduces clinical fracture risk despite modest effects on bone mass and density. This reduction in fracture risk may be due to improved material level-mechanical properties through a non-cell mediated increase in bone hydration. Synthetic salmon calcitonin (CAL) has also demonstrated efficacy in reducing fracture risk with only modest bone mass and density improvements. This study aimed to determine if CAL could modify healthy and diseased bone through cell-independent mechanisms that alter hydration similar to RAL. 26-week-old male C57BL/6 mice induced with chronic kidney disease (CKD) beginning at 16 weeks of age via 0.2 % adenine-laced casein-based (0.9 % P, 0.6 % C) chow, and their non-CKD control littermates (Con), were utilized. Upon sacrifice, right femora were randomly assigned to the following ex vivo experimental groups: RAL (2 μM, n = 10 CKD, n = 10 Con), CAL (100 nM, n = 10 CKD, n = 10 Con), or Vehicle (VEH; n = 9 CKD, n = 9 Con). Bones were incubated in PBS + drug solution at 37 °C for 14 days using an established ex vivo soaking methodology. Cortical geometry (μCT) was used to confirm a CKD bone phenotype, including porosity and cortical thinning, at sacrifice. Femora were assessed for mechanical properties (3-point bending) and bone hydration (via solid state nuclear magnetic resonance spectroscopy with magic angle spinning (ssNMR)). Data were analyzed by two-tailed t-tests (μCT) or 2-way ANOVA for main effects of disease, treatment, and their interaction. Tukey's post hoc analyses followed a significant main effect of treatment to determine the source of the effect. Imaging confirmed a cortical phenotype reflective of CKD, including lower cortical thickness (p < 0.0001) and increased cortical porosity (p = 0.02) compared to Con. In addition, CKD resulted in weaker, less deformable bones. In CKD bones, ex vivo exposure to RAL or CAL improved total work (+120 % and +107 %, respectively; p < 0.05), post-yield work (+143 % and +133 %), total displacement (+197 % and +229 %), total strain (+225 % and +243 %), and toughness (+158 % and +119 %) vs. CKD VEH soaked bones. Ex vivo exposure to RAL or CAL did not impact any mechanical properties in Con bone. Matrix-bound water by ssNMR showed CAL treated bones had significantly higher bound water compared to VEH treated bones in both CKD and Con cohorts (p = 0.001 and p = 0.01, respectively). RAL positively modulated bound water in CKD bone compared to VEH (p = 0.002) but not in Con bone. There were no significant differences between bones soaked with CAL vs. RAL for any outcomes measured. RAL and CAL improve important post-yield properties and toughness in a non-cell mediated manner in CKD bone but not in Con bones. While RAL treated CKD bones had higher matrix-bound water content in line with previous reports, both Con and CKD bones exposed to CAL had higher matrix-bound water. Therapeutic modulation of water, specifically the bound water fraction, represents a novel approach to improving mechanical properties and potentially reducing fracture risk.
  5. Evans, Robin J.; Shpitser, Ilya (Ed.)
    Most existing approaches to differentially private (DP) machine learning focus on private training. Despite its many advantages, private training lacks flexibility in adapting to incremental changes to the training dataset, such as deletion requests made under GDPR’s right to be forgotten. We revisit a long-forgotten alternative, known as private prediction, and propose a new algorithm named Individual Kernelized Nearest Neighbor (Ind-KNN). Ind-KNN is easily updatable over dataset changes and allows precise control of the Rényi DP at an individual user level: a user’s privacy loss is measured by the exact amount of her contribution to predictions, and a user is removed if her prescribed privacy budget runs out. Our results show that Ind-KNN consistently improves accuracy over existing private prediction methods for a wide range of epsilon on four vision and language tasks. We also illustrate several cases in which Ind-KNN is preferable to private training with NoisySGD.
  6. Large language models have been shown to memorize private information, such as social security numbers, in their training data. Given the sheer scale of the training corpus, it is challenging to screen and filter such private data, either manually or automatically. In this paper, we propose Confidentially Redacted Training (CRT), a method to train language generation models while protecting the confidential segments. We borrow ideas from differential privacy (which solves a related but distinct problem) and show that our method is able to provably prevent unintended memorization by randomizing parts of the training process. Moreover, we show that redaction with an approximately correct screening policy amplifies the confidentiality guarantee. We implement the method for both LSTM and GPT language models. Our experimental results show that the models trained by CRT obtain almost the same perplexity while preserving strong confidentiality.
  7. We consider the problem of estimating a function from $n$ noisy samples whose discrete Total Variation (TV) is bounded by $C_n$. We reveal a deep connection to the seemingly disparate problem of Strongly Adaptive online learning (Daniely et al., 2015) and provide an $O(n \log n)$ time algorithm that attains the near minimax optimal rate of $\tilde{O}(n^{1/3} C_n^{2/3})$ under squared error loss. The resulting algorithm runs online and optimally adapts to the unknown smoothness parameter $C_n$. This leads to a new and more versatile alternative to wavelets-based methods for (1) adaptively estimating TV bounded functions; (2) online forecasting of TV bounded trends in time series.
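The corpus-level estimate described in item 3 can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a two-component mixture in which each observed item has a known probability under a human-written reference distribution (`p_human`) and under an AI-generated reference distribution (`p_ai`), and it finds the mixing fraction `alpha` that maximizes the log-likelihood. The function name, the argument names, and the use of ternary search are all illustrative choices.

```python
import math


def estimate_llm_fraction(p_human, p_ai, iterations=100):
    """Maximum likelihood estimate of alpha in the mixture
    (1 - alpha) * P_human + alpha * P_ai.

    p_human[i] and p_ai[i] are the (assumed strictly positive)
    probabilities of observed item i under each reference distribution.
    """
    if len(p_human) != len(p_ai) or not p_human:
        raise ValueError("inputs must be non-empty and of equal length")

    def log_likelihood(alpha):
        return sum(
            math.log((1.0 - alpha) * ph + alpha * pa)
            for ph, pa in zip(p_human, p_ai)
        )

    # The log-likelihood is concave in alpha, so ternary search on [0, 1]
    # converges to the maximizer.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1) < log_likelihood(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Toy data (made up for illustration): 80 items that look human-written
# and 20 items that look AI-generated under the two reference models.
p_human = [0.9] * 80 + [0.1] * 20
p_ai = [0.1] * 80 + [0.9] * 20
alpha_hat = estimate_llm_fraction(p_human, p_ai)
print(f"estimated LLM fraction: {alpha_hat:.3f}")
```

For this toy input, setting the derivative of the log-likelihood to zero gives alpha = 0.125 analytically, which is lower than the naive 20% count because each item is only weak evidence for either source. A real analysis would need reference distributions estimated from data and some handling of zero probabilities, neither of which is covered here.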